0/1 Knapsack (medium)
Introduction #
Given the weights and profits of ‘N’ items, we are asked to put these items in a knapsack which has a capacity ‘C’. The goal is to get the maximum profit out of the items in the knapsack. Each item can only be selected once, as we don’t have multiple quantities of any item.
Let’s take the example of Merry, who wants to carry some fruits in the knapsack to get maximum profit. Here are the weights and profits of the fruits:
Items:
{ Apple,
Orange,
Banana,
Melon
}
Weights:
{ 2, 3,
1, 4
}
Profits:
{ 4, 5,
3, 7
}
Knapsack
capacity:
5
Let’s try to put various combinations of fruits in the knapsack, such that their total weight is not more than 5:
Apple +
Orange
(total
weight
5) =>
9 profit
Apple +
Banana
(total
weight
3) =>
7 profit
Orange +
Banana
(total
weight
4) =>
8 profit
Banana +
Melon
(total
weight
5) =>
10
profit
This shows that Banana + Melon is the best combination as it gives us the maximum profit and the total weight does not exceed the capacity.
Problem Statement #
Given two integer arrays to represent weights and profits of ‘N’ items, we need to find a subset of these items which will give us maximum profit such that their cumulative weight is not more than a given number ‘C’. Each item can only be selected once, which means either we put an item in the knapsack or we skip it.
Basic Solution #
A basic brute-force solution could be to try all combinations of the given items (as we did above), allowing us to choose the one with maximum profit and a weight that doesn’t exceed ‘C’. Take the example of four items (A, B, C, and D), as shown in the diagram below. To try all the combinations, our algorithm will look like:
Here is a visual representation of our algorithm:
All green boxes have a total weight that is less than or equal to the capacity (7), and all the red ones have a weight that is more than 7. The best solution we have is with items [B, D] having a total profit of 22 and a total weight of 7.
Code #
Here is the code for the brute-force solution:
Time and Space complexity #
The time complexity of the above algorithm is exponential , where ‘n’ represents the total number of items. This can also be confirmed from the above recursion tree. As we can see, we will have a total of ‘31’ recursive calls – calculated through , which is asymptotically equivalent to .
The space complexity is . This space will be used to store the recursion stack. Since the recursive algorithm works in a depth-first fashion, which means that we can’t have more than ‘n’ recursive calls on the call stack at any time.
Overlapping Sub-problems: Let’s visually draw the recursive calls to see if there are any overlapping sub-problems. As we can see, in each recursive call, profits and weights arrays remain constant, and only capacity and currentIndex change. For simplicity, let’s denote capacity with ‘c’ and currentIndex with ‘i’:
We can clearly see that ‘c:4, i=3’ has been called twice. Hence we have an overlapping sub-problems pattern. We can use Memoization to efficiently solve overlapping sub-problems.
Top-down Dynamic Programming with Memoization #
Memoization is when we store the results of all the previously solved sub-problems and return the results from memory if we encounter a problem that has already been solved.
Since we
have two
changing
values (capacity
and
currentIndex)
in our
recursive
function
knapsackRecursive(),
we can
use a
two-dimensional
array to
store
the
results
of all
the
solved
sub-problems.
As
mentioned
above,
we need
to store
results
for
every
sub-array
(i.e.
for
every
possible
index
‘i’) and
for
every
possible
capacity
‘c’.
Code #
Here is the code with memoization (see changes in the highlighted lines):
Time and Space complexity #
Since
our
memoization
array
dp[profits.length][capacity+1]
stores
the
results
for all
subproblems,
we can
conclude
that we
will not
have
more
than
subproblems
(where
‘N’ is
the
number
of items
and ‘C’
is the
knapsack
capacity).
This
means
that our
time
complexity
will be
.
The above algorithm will use space for the memoization array. Other than that we will use space for the recursion call-stack. So the total space complexity will be , which is asymptotically equivalent to
Bottom-up Dynamic Programming #
Let’s
try to
populate
our
dp[][]
array
from the
above
solution
by
working
in a
bottom-up
fashion.
Essentially,
we want
to find
the
maximum
profit
for
every
sub-array
and for
every
possible
capacity.
This
means
that
dp[i][c]
will
represent
the
maximum
knapsack
profit
for
capacity
‘c’
calculated
from
the
first
‘i’
items.
So, for each item at index ‘i’ (0 <= i < items.length) and capacity ‘c’ (0 <= c <= capacity), we have two options:
-
Exclude
the
item
at
index
‘i’.
In
this
case,
we
will
take
whatever
profit
we
get
from
the
sub-array
excluding
this
item
=>
dp[i-1][c] -
Include
the
item
at
index
‘i’
if
its
weight
is
not
more
than
the
capacity.
In
this
case,
we
include
its
profit
plus
whatever
profit
we
get
from
the
remaining
capacity
and
from
remaining
items
=>
profit[i] + dp[i-1][c-weight[i]]
Finally, our optimal solution will be maximum of the above two values:
dp[i][c] = max (dp[i-1][c], profit[i] + dp[i-1][c-weight[i]])
Let’s draw this visually and start with our base case of zero capacity:
Code #
Here is the code for our bottom-up dynamic programming approach:
Time and Space complexity #
The above solution has the time and space complexity of , where ‘N’ represents total items and ‘C’ is the maximum capacity.
How can we find the selected items? #
As we know, the final profit is at the bottom-right corner. Therefore, we will start from there to find the items that will be going into the knapsack.
As you remember, at every step we had two options: include an item or skip it. If we skip an item, we take the profit from the remaining items (i.e. from the cell right above it); if we include the item, then we jump to the remaining profit to find more items.
Let’s understand this from the above example:
- ‘22’ did not come from the top cell (which is 17); hence we must include the item at index ‘3’ (which is item ‘D’).
- Subtract the profit of item ‘D’ from ‘22’ to get the remaining profit ‘6’. We then jump to profit ‘6’ on the same row.
- ‘6’ came from the top cell, so we jump to row ‘2’.
- Again ‘6’ came from the top cell, so we jump to row ‘1’.
- ‘6’ is different than the top cell, so we must include this item (which is item ‘B’).
- Subtract the profit of ‘B’ from ‘6’ to get profit ‘0’. We then jump to profit ‘0’ on the same row. As soon as we hit zero remaining profit, we can finish our item search.
- Thus the items going into the knapsack are {B, D}.
Let’s write a function to print the set of items included in the knapsack.
Challenge #
Can we improve our bottom-up DP solution even further? Can you find an algorithm that has space complexity?
We only need one previous row to find the optimal solution!
The
solution
above is
similar
to the
previous
solution,
the only
difference
is that
we use
i%2
instead
if
i
and
(i-1)%2
instead
if
i-1.
This
solution
has a
space
complexity
of ,
where
‘C’ is
the
maximum
capacity
of the
knapsack.
This space optimization solution can also be implemented using a single array. It is a bit tricky, but the intuition is to use the same array for the previous and the next iteration!
If you
see
closely,
we need
two
values
from the
previous
iteration:
dp[c]
and
dp[c-weight[i]]
Since
our
inner
loop is
iterating
over
c:0-->capacity,
let’s
see how
this
might
affect
our two
required
values:
- When
we
access
dp[c], it has not been overridden yet for the current iteration, so it should be fine. -
dp[c-weight[i]]might be overridden if “weight[i] > 0”. Therefore we can’t use this value for the current iteration.
To solve
the
second
case, we
can
change
our
inner
loop to
process
in the
reverse
direction:
c:capacity-->0.
This
will
ensure
that
whenever
we
change a
value in
dp[],
we will
not need
it again
in the
current
iteration.
Can you try writing this algorithm?